Abstract
Accurate molecular classification of hematological malignancies is essential for treatment and prognosis. Current diagnostic workflows combine a range of tools, including flow cytometry, karyotyping, NGS, and RNA sequencing. These can be time-consuming and expensive, which can delay targeted treatment or prevent it altogether. DNA methylation-based classification has reshaped molecular diagnosis of central nervous system (CNS) tumors and sarcomas. Advances in nanopore-based DNA methylation classification have allowed accurate intra-operative classification of CNS tumors within 90 minutes, assisting decision making during surgery. However, comparably rapid and precise diagnostics for hematologic malignancies have not yet been explored. We therefore set out to develop a clinically deployable framework that couples nanopore sequencing with machine learning to provide high-resolution, real-time classification of hematologic malignancies.
To construct a reference atlas, we integrated >5,400 Illumina 450k/EPIC/EPIC-v2 methylation arrays representing 38 classes across the spectrum of hematological disease. Specifically, the atlas spans 36 molecular subtypes (18 AML subtypes, 10 B-cell precursor ALL subtypes, and one subtype each of T-ALL, low-risk MDS, non-Ph MPN, JMML, CMML, BPDCN, B-PLL, and CLL) plus two non-neoplastic control classes. Nanopore-style sparse-read simulations were generated to reproduce the coverage and error profile of real-time methylation calling. Lamprey, a neural network with calibrated confidence scoring, was trained on these simulated read sets.
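The abstract does not specify how the sparse-read simulations were produced. As a minimal illustrative sketch only, one plausible approach is to subsample a small random set of CpG sites from an array beta-value profile, draw a binary methylation call at each site, and flip a fraction of calls to mimic calling errors. The function name `simulate_sparse_profile` and the parameters `n_sites` and `error_rate` are hypothetical and are not taken from Lamprey.

```python
import numpy as np

def simulate_sparse_profile(beta, n_sites, error_rate=0.05, rng=None):
    """Hypothetical sketch: turn a dense array profile into a sparse,
    binary, noisy profile resembling low-coverage nanopore calls."""
    rng = np.random.default_rng() if rng is None else rng
    beta = np.asarray(beta, dtype=float)

    # Pick which CpG sites happen to be covered by reads.
    idx = rng.choice(beta.size, size=n_sites, replace=False)

    # A single read reports methylated/unmethylated; draw each call
    # with probability equal to the array beta value at that site.
    calls = rng.random(n_sites) < beta[idx]

    # Flip a fraction of calls to imitate methylation-calling errors.
    flips = rng.random(n_sites) < error_rate
    calls = calls ^ flips

    # Uncovered sites stay missing (NaN); covered sites are 0.0 or 1.0.
    sparse = np.full(beta.size, np.nan)
    sparse[idx] = calls.astype(float)
    return sparse

# Example: 1,000 array sites, of which 100 are observed.
profile = simulate_sparse_profile(np.linspace(0, 1, 1000), n_sites=100,
                                  rng=np.random.default_rng(0))
```

Repeating such a draw many times per reference sample would yield many distinct sparse training examples from one array, which is one way a classifier could be exposed to the variability of real-time sequencing.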
We validated Lamprey's performance on a hold-out test set and on a validation cohort of 49 retrospective samples sequenced on a nanopore device using adaptive sequencing. The model achieved a micro F1 score of 0.96 on the hold-out test set and predicted 47/52 retrospective samples correctly. Of those, 41 correct and 4 incorrect predictions reached a confidence score above 0.95. We additionally ran nanopore-based structural variant and chromosome abnormality calling, which confirmed all previously defined subtypes. The highly confident incorrect predictions consist of near-haploid/low-hypodiploid ALL, a PAX5-altered ALL with a BCR-ABL1-like methylation and transcriptional profile, and a KMT2A-PTD AML. Little training data is available for KMT2A-PTD AML and near-haploid/low-hypodiploid ALL, indicating a need for further training data for rare and newly identified subtypes. Within our training dataset we identified groupings within genomic subtypes that aligned with biological characteristics of the samples. For example, three groups of KMT2A-rearranged AMLs were identified, one of which is enriched in acute megakaryoblastic leukemia, and another of which overlaps with the NUP98-rearranged and NPM1 subtypes. Furthermore, we are able to distinguish between closely related molecular subtypes, including DEK-NUP214, NUP98-rearranged, NPM1-mutated and KMT2A-rearranged AMLs. Methylation profiles also distinguish myeloproliferative neoplasms, including JMML and BCR-ABL1-negative myeloproliferative neoplasms, from control samples.
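For readers unfamiliar with the reported metrics, the following sketch shows how a micro-averaged F1 score and the counts of correct and incorrect predictions above a confidence threshold can be computed. It is an illustration under assumptions: the function names and the toy labels are hypothetical, and only the 0.95 threshold comes from the text above.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true positives, false positives and
    false negatives over all classes before computing F1."""
    tp = fp = fn = 0
    for cls in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            tp += (t == cls and p == cls)
            fp += (t != cls and p == cls)
            fn += (t == cls and p != cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def confident_counts(y_true, y_pred, confidence, threshold=0.95):
    """Count correct and incorrect predictions whose confidence
    score is strictly above the threshold."""
    correct = incorrect = 0
    for t, p, c in zip(y_true, y_pred, confidence):
        if c > threshold:
            if t == p:
                correct += 1
            else:
                incorrect += 1
    return correct, incorrect

# Toy labels, not data from the study.
y_true = ["AML", "AML", "ALL", "CLL"]
y_pred = ["AML", "ALL", "ALL", "CLL"]
conf = [0.99, 0.97, 0.90, 0.96]

print(micro_f1(y_true, y_pred))                # 0.75
print(confident_counts(y_true, y_pred, conf))  # (2, 1)
```

When every sample carries exactly one label, micro F1 equals overall accuracy, so the threshold-based counts add information that the single score does not: they show how often the model is wrong while reporting high confidence.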
Lamprey is a novel framework for methylation-based rapid molecular diagnosis of hematological disorders. Combining nanopore sequencing with sparse methylation profiling makes it possible to reach a fine-grained diagnosis within hours using a single assay. This could greatly reduce the resources and expertise required for molecular diagnosis and reduce the need for time-intensive workflows for identifying ‘like’ acute leukemias. Methylation profiling could also lead to the identification of previously undefined subtypes of malignancies, as it already has in CNS tumor diagnosis.